This report identifies high-performing players who are currently underpaid and whom the team could therefore recruit.

Variable Selection

The first step is to see which variables are most correlated with salary:

The two variables most correlated with salary are points (PTS) and assists (AST), with correlations of 0.59 and 0.58, respectively. These will be used as the two features (dimensions) for the clustering algorithm. Players with both high points and high assists are considered high performing.
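The ranking step above can be sketched as follows. This is a minimal illustration in Python, not the code behind this report: the `pearson` helper and the choice of candidate columns are assumptions, and the eleven rows are the hand-picked players tabulated later in this report, so the coefficients it prints will not match the 0.59 and 0.58 computed on the full data set.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# (Age, PTS, AST, 2020-21 salary) for the players tabulated later in this report.
rows = [
    (23, 806, 265, 8099627),   # DeAaronFox
    (21, 916, 287, 8049360),   # LukaDoni
    (22, 897, 321, 6571800),   # TraeYoung
    (24, 839, 183, 5195501),   # DonovanMitchell
    (22, 697, 187, 4141320),   # ShaiGilgeousAlexander
    (31, 452, 172, 34379100),  # JimmyButler
    (30, 526, 151, 41254920),  # JohnWall
    (33, 466, 164, 34504132),  # MikeConley
    (34, 290, 68, 4767000),    # GarrettTemple
    (28, 215, 216, 3500000),   # TJMcConnell
    (30, 383, 102, 13920000),  # WillBarton
]
columns = {"Age": 0, "PTS": 1, "AST": 2}  # candidate variables (assumed subset)
salary = [r[3] for r in rows]

# Rank candidate variables by the absolute strength of their correlation with salary.
ranked = sorted(
    ((name, pearson([r[i] for r in rows], salary)) for name, i in columns.items()),
    key=lambda item: abs(item[1]),
    reverse=True,
)
for name, r in ranked:
    print(f"{name}: {r:+.2f}")
```

Ranking by absolute value treats strong negative correlations as informative too; whether the report ranked by signed or absolute correlation is not stated.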

Clustering Algorithm Results

Using a clustering algorithm with two clusters (or groups), we obtain the following results:

We can use these results from the clustering algorithm to view salary as well, and identify which players may be high performing and underpaid.

With two clusters, the clustering accounts for 59.2% of the variance in the data. This is acceptable, but adding clusters will increase the share of variance explained.
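A figure like this can be reproduced in principle with the sketch below. It assumes that the 59.2% is the ratio of between-cluster to total sum of squares that k-means implementations commonly report, and that the features were z-scored; neither is confirmed by the report. The `kmeans` and `explained_variance` helpers and the starting centers are illustrative choices, and the input is only the eleven players tabulated later, so the printed percentage will differ from 59.2%.

```python
from statistics import mean, pstdev

def zscore(values):
    """Standardize to mean 0 and (population) standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, start_centers, max_iter=100):
    """Lloyd's algorithm from explicit starting centers; returns (labels, centers)."""
    centers = [tuple(c) for c in start_centers]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = tuple(mean(dim) for dim in zip(*members))
    return labels, centers

def explained_variance(points, labels, centers):
    """Between-cluster SS / total SS, computed as 1 - within SS / total SS."""
    grand_mean = tuple(mean(dim) for dim in zip(*points))
    total_ss = sum(sq_dist(p, grand_mean) for p in points)
    within_ss = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return 1 - within_ss / total_ss

# PTS and AST for the players tabulated later in this report.
pts = [806, 916, 897, 839, 697, 452, 526, 466, 290, 215, 383]
ast = [265, 287, 321, 183, 187, 172, 151, 164, 68, 216, 102]
points = list(zip(zscore(pts), zscore(ast)))

# Start from the lowest- and highest-scoring players (an arbitrary, deterministic choice).
start = [points[pts.index(min(pts))], points[pts.index(max(pts))]]
labels, centers = kmeans(points, start)
share = explained_variance(points, labels, centers)
print(f"variance explained by 2 clusters: {share:.1%}")
```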

Evaluating different numbers of clusters

Two clusters gave an acceptable fit, but more clusters will better separate the players who are truly high performing. We can use the elbow method to see how much variance is explained by 2 to 10 clusters.
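The elbow computation can be sketched as below. This is an illustration under stated assumptions rather than the code used for this report: `within_ss` is a hypothetical helper that runs a small k-means with a deterministic farthest-point seeding, the features are z-scored, and the data are only the eleven players tabulated later, so the curve will not match the one from the full data set.

```python
from statistics import mean, pstdev

def zscore(values):
    """Standardize to mean 0 and (population) standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def farthest_point_init(points, k):
    """Deterministic seeding: take the first point, then repeatedly add the
    point farthest from every center chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def within_ss(points, k, max_iter=100):
    """Total within-cluster sum of squares after k-means with k clusters."""
    centers = farthest_point_init(points, k)
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        new_centers = [
            tuple(mean(dim) for dim in zip(*members)) if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # centers stopped moving
        centers = new_centers
    return sum(sq_dist(p, centers[i]) for i, members in enumerate(clusters) for p in members)

# PTS and AST for the players tabulated later in this report.
pts = [806, 916, 897, 839, 697, 452, 526, 466, 290, 215, 383]
ast = [265, 287, 321, 183, 187, 172, 151, 164, 68, 216, 102]
points = list(zip(zscore(pts), zscore(ast)))

total_ss = within_ss(points, 1)  # with one cluster, within SS equals total SS
curve = {k: 1 - within_ss(points, k) / total_ss for k in range(2, 11)}
for k, explained in curve.items():
    print(f"k={k:2d}  variance explained: {explained:.1%}")
```

The "elbow" is the value of k after which additional clusters add little explained variance; reading it off the curve is a judgment call rather than a computed result.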

Another way to identify the optimal number of clusters is the NbClust method. It evaluates the candidate numbers of clusters under many different criteria, and each criterion votes for the number it considers best. Below is a graph of the number of votes for each number of clusters.

The elbow method and the NbClust method yield quite different results for the optimal number of clusters. We will therefore run the algorithm with 4 clusters, as a compromise between the two methods’ results that avoids overfitting.

Results after using 4 clusters:

You can hover over the plot to identify the points, assists, cluster, and player name. Note that this data is normalized; we will look at the true values in the following section.
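The relationship between normalized and true values can be sketched as follows. The report does not say which normalization was applied; this example assumes a z-score (subtract the mean, divide by the standard deviation), and the helper names are hypothetical.

```python
from statistics import mean, pstdev

def fit_scaler(values):
    """Return the (center, scale) pair used for z-score normalization."""
    return mean(values), pstdev(values)

def normalize(values, center, scale):
    """Map true values to normalized values."""
    return [(v - center) / scale for v in values]

def denormalize(zs, center, scale):
    """Map normalized values back to true values."""
    return [z * scale + center for z in zs]

# PTS for the players tabulated later in this report.
pts = [806, 916, 897, 839, 697, 452, 526, 466, 290, 215, 383]

center, scale = fit_scaler(pts)
pts_norm = normalize(pts, center, scale)
pts_back = denormalize(pts_norm, center, scale)

for raw, z in zip(pts, pts_norm):
    print(f"{raw:4d} points -> {z:+.2f} normalized")
```

Using the population standard deviation here is an arbitrary choice; a sample-based standard deviation would rescale every normalized value by the same constant factor.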

Player Selection

Yes - players we want

We want players with a high number of points and assists but low current salaries. On the chart above, these are the data points that are in cluster 4 (x’s) and are lighter blue in color. Three points at the top right of the plot meet these criteria. These players are:

  1. TraeYoung
  2. LukaDoni
  3. DeAaronFox

Data for these players:

## # A tibble: 3 x 7
##   Player     Pos     Age Tm      PTS   AST `2020-21`
##   <chr>      <chr> <dbl> <chr> <dbl> <dbl>     <dbl>
## 1 DeAaronFox PG       23 SAC     806   265   8099627
## 2 LukaDoni   PG       21 DAL     916   287   8049360
## 3 TraeYoung  PG       22 ATL     897   321   6571800
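The selection rule above (high points, high assists, low salary) can be expressed as a simple filter. The report selected players visually from cluster membership and color; the numeric cutoffs below are hypothetical stand-ins chosen only for illustration, and the rows are the players tabulated in this report.

```python
from typing import NamedTuple

class Player(NamedTuple):
    name: str
    pos: str
    pts: int
    ast: int
    salary: int  # 2020-21 salary

players = [
    Player("DeAaronFox", "PG", 806, 265, 8099627),
    Player("LukaDoni", "PG", 916, 287, 8049360),
    Player("TraeYoung", "PG", 897, 321, 6571800),
    Player("DonovanMitchell", "SG", 839, 183, 5195501),
    Player("ShaiGilgeousAlexander", "SG", 697, 187, 4141320),
    Player("JimmyButler", "SF", 452, 172, 34379100),
    Player("JohnWall", "PG", 526, 151, 41254920),
    Player("MikeConley", "PG", 466, 164, 34504132),
    Player("GarrettTemple", "SG", 290, 68, 4767000),
    Player("TJMcConnell", "PG", 215, 216, 3500000),
    Player("WillBarton", "SF", 383, 102, 13920000),
]

# Hypothetical cutoffs standing in for "cluster 4 and lighter blue" on the plot.
MIN_PTS = 800
MIN_AST = 250
MAX_SALARY = 10_000_000

def is_target(p):
    """High points, high assists, and a low current salary."""
    return p.pts >= MIN_PTS and p.ast >= MIN_AST and p.salary <= MAX_SALARY

targets = sorted(p.name for p in players if is_target(p))
print(targets)  # ['DeAaronFox', 'LukaDoni', 'TraeYoung']
```

Relaxing `MIN_AST` would admit high scorers at other positions, which is in the spirit of the search for non-point-guards that follows.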

All three of these players are point guards - let’s find some other players we would want to hire that play other positions.

  1. DonovanMitchell
  2. ShaiGilgeousAlexander

Data for these players:

## # A tibble: 2 x 7
##   Player                Pos     Age Tm      PTS   AST `2020-21`
##   <chr>                 <chr> <dbl> <chr> <dbl> <dbl>     <dbl>
## 1 DonovanMitchell       SG       24 UTA     839   183   5195501
## 2 ShaiGilgeousAlexander SG       22 OKC     697   187   4141320

Both of these players are shooting guards who perform well despite relatively low current salaries. They would be good additions to the team.

No - players we don’t want

The team definitely does not want players who are already very highly paid, who perform poorly, or both. Three that we definitely do not want on the team are:

  1. JimmyButler
  2. MikeConley
  3. JohnWall

Each of these players is highly paid and has fewer points and assists than the players identified in the previous section. Here are their stats:

## # A tibble: 3 x 7
##   Player      Pos     Age Tm      PTS   AST `2020-21`
##   <chr>       <chr> <dbl> <chr> <dbl> <dbl>     <dbl>
## 1 JimmyButler SF       31 MIA     452   172  34379100
## 2 JohnWall    PG       30 HOU     526   151  41254920
## 3 MikeConley  PG       33 UTA     466   164  34504132

Maybe - these players aren’t great, and have average salaries

We are unsure about these players:

  1. GarrettTemple
  2. WillBarton
  3. TJMcConnell

Data for these players:

## # A tibble: 3 x 7
##   Player        Pos     Age Tm      PTS   AST `2020-21`
##   <chr>         <chr> <dbl> <chr> <dbl> <dbl>     <dbl>
## 1 GarrettTemple SG       34 CHI     290    68   4767000
## 2 TJMcConnell   PG       28 IND     215   216   3500000
## 3 WillBarton    SF       30 DEN     383   102  13920000